[...] advent of deep learning techniques, it is possible to forecast solar irradiance accurately for a longer time. In this paper, day-ahead solar irradiance is forecasted using encoder-decoder sequence-to-sequence models with attention mechanism. This study [...] sequence-to-sequence model and compared with smart persistence (SP), back propagation neural network (BPNN), recurrent neural network (RNN), long short term memory (LSTM) and encoder-decoder sequence-to-sequence LSTM with attention [...] is more accurate and has reduced forecast error of 31.1%, 19.3% and 8.5% respectively for day-ahead solar irradiance forecast with 31.07% as forecast skill.
1. INTRODUCTION

The integration of solar electricity, a form of distributed energy resource (DER), into the power grid has developed rapidly in recent years owing to reduced manufacturing costs and increased efficiency of photovoltaic (PV) panels. The amount of electricity that DERs can generate is stochastic in nature because it depends on weather parameters. This makes it challenging for grid operators to estimate generation and to schedule and distribute power. An accurate day-ahead forecast of solar irradiance, built with big data and deep learning models, helps to address this problem.

Forecast models for solar irradiance in the literature fall into persistence models, physical models and statistical models. Very short-term forecasts (seconds to less than 30 minutes) are commonly made with the persistence model [1], [2]. Because the accuracy of the persistence model decreases as the forecast horizon increases, it is not preferred for a 24-hour day-ahead forecast. In physical models, or numerical weather prediction models [1], [3], the state of the atmosphere is described by mathematical equations that require numerical methods to solve. Forecasts made with physical models give erroneous results when meteorological variables such as relative humidity, wind speed and wind direction change suddenly. An artificial neural network (ANN) based multilayer perceptron model [4], [5] trained with the Levenberg-Marquardt algorithm was proposed to forecast solar irradiance 24 hours ahead, and it was found that using meteorological parameters as input variables improves forecast accuracy. Higher-dimensional input variables [6]-[9] (up to 900 inputs) have been used with ANN models of different architectures to predict short-term global solar irradiance, with a 20% reduction in errors. Deep learning models are a subset of machine learning, and for solar irradiance forecasting they achieve higher accuracy than conventional machine learning models. A day-ahead solar irradiance forecast method using a long short-term memory (LSTM) network with weather variables as feature vectors was developed, and the results show that LSTM outperforms the other conventional forecast methods in terms of forecast accuracy. Jeon et al. proposed an LSTM based deep learning model for solar irradiance forecast that uses weather variables and the solar irradiance of the previous day as feature vectors. Simulation results show that forecast accuracy improves when the solar irradiance of the previous day is also used as an input feature. Gao et al. proposed a gated recurrent unit (GRU) based model for hourly day-ahead solar irradiance forecast using weather variables.

In this paper, day-ahead solar irradiance is forecasted using LSTM based encoder-decoder models with attention mechanism. Initially, the data are cleaned and converted into a structured multivariate problem for training encoder-decoder sequence-to-sequence models. Input variables are selected from the list of meteorological parameters based on the Pearson correlation coefficient. Comprehensive experiments are carried out to determine the forecast accuracy with meteorological parameters as input variables. The experiments show that LSTM based encoder-decoder sequence-to-sequence models with attention mechanism have lower errors than the compared models.

From the perspective of decision making in a microgrid or smart grid, forecast horizons are classified as very short-term, short-term, medium-term and long-term. Very short-term forecasts are used in real-time monitoring of photovoltaic power, with a horizon from a few seconds to minutes ahead. Short-term forecasts are used in decision-making applications in power system operation such as economic dispatch and unit commitment, with a horizon of up to 48 to 72 hours ahead. Scheduling and maintenance of power plants are planned with medium-term forecasts, whose horizon is up to one week ahead. Long-term forecasts help in the assessment of solar energy, with a horizon from months to years. Unit commitment [15] is day-ahead for power plants such as biomass, nuclear and coal, and hour-ahead for power plants such as gas and oil. This time horizon is set according to their startup and shutdown times. With renewable integration into the grid, unit commitment and economic dispatch decisions therefore vary depending on solar forecasts.

In this paper, day-ahead solar irradiance is forecasted using different deep-learning techniques. In the day-ahead forecast, the previous day's data is used as input to forecast the irradiance of the next 11 hours with a resolution of one hour. Geographical location also affects the forecast error, and hence the models described here are also tested for different locations with different climatic conditions. This paper is organised as follows: the methodology is described in section 2, the data and preprocessing in section 3, the experiments and results in section 4, and the conclusion and future work in section 5.


2. METHOD 

The long short-term memory network is the base for all the other models. Hence, the architecture of LSTM, the LSTM based encoder-decoder sequence-to-sequence model with attention mechanism and a benchmark algorithm are described in detail. Smart persistence is used as the benchmark algorithm against which the proposed method is compared.


2.1. Smart persistence-benchmark algorithm 

Forecast error varies with dataset, location and horizon. Hence, for a fair comparison, a benchmarking algorithm such as the smart persistence (SP) model, also called the scaled persistence model, is suggested. The smart persistence model assumes that the clear-sky index $k_{cs}(t)$ in (1) persists over the forecast horizon, so that the predicted value $G(t+h)$ is the product of the clear-sky index $k_{cs}(t)$ and the clear-sky irradiance at the next moment $G_{cs}(t+h)$.


$$k_{cs}(t) = \frac{G(t)}{G_{cs}(t)} \qquad (1)$$

$$G(t+h) = k_{cs}(t)\, G_{cs}(t+h) \qquad (2)$$


where $G(t)$ is the measured irradiance, $k_{cs}(t)$ is the clear-sky index, $G_{cs}(t)$ is the clear-sky irradiance and $h$ in (2) is the forecast horizon.
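
As a rough illustration of (1) and (2), the benchmark can be sketched in a few lines of Python. The function name, the `eps` guard for near-zero clear-sky irradiance and the sample values are assumptions made for this example and are not taken from the paper.

```python
def smart_persistence(ghi_now, clear_sky_now, clear_sky_future, eps=1e-6):
    """Forecast G(t+h) = k_cs(t) * G_cs(t+h), with k_cs(t) = G(t) / G_cs(t)."""
    if clear_sky_now < eps:
        # Near-zero clear-sky irradiance (e.g. night): the index is undefined,
        # so this sketch falls back to a zero forecast (an assumption).
        return 0.0
    k_cs = ghi_now / clear_sky_now      # clear-sky index, equation (1)
    return k_cs * clear_sky_future      # persistence forecast, equation (2)


# Illustrative values in W/m^2 (made up for the example, not from the dataset).
forecast = smart_persistence(ghi_now=450.0, clear_sky_now=600.0,
                             clear_sky_future=800.0)
```

Here the clear-sky index is 0.75, so the forecast is three quarters of the future clear-sky irradiance.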
2.2. Encoder-decoder sequence-to-sequence architecture

Traditional neural networks such as the back propagation neural network (BPNN) do not have memory to understand and process sequential data. This limitation was overcome by the recurrent neural network (RNN) [19]. RNNs have loops within them that allow information to persist. However, RNNs suffer from vanishing and exploding gradient problems that prevent them from learning long sequences. Hochreiter et al. proposed the LSTM network [20], a recurrent neural network that can process sequential data effectively, as shown in Figure 1.


Figure 1. LSTM cell structure 


The inputs of a single LSTM unit are the input vector at the current time step $x_t$, the output of the previous LSTM unit $h_{t-1}$ and the memory of the previous LSTM unit, also called the cell state, $c_{t-1}$. The outputs of a single LSTM unit are the output of the hidden layer $h_t$ and the memory at the current time step $c_t$. Each LSTM unit processes the information through the forget gate ($f_t$), input gate ($i_t$) and output gate ($o_t$) according to (3), (4) and (5).


$$f_t = \sigma(W_{xfor}\, x_t + W_{hfor}\, h_{t-1} + b_{for}) \qquad (3)$$

$$i_t = \sigma(W_{xinp}\, x_t + W_{hinp}\, h_{t-1} + b_{inp}) \qquad (4)$$

$$o_t = \sigma(W_{xout}\, x_t + W_{hout}\, h_{t-1} + b_{out}) \qquad (5)$$


where $W_{xfor}$, $W_{hfor}$ are the forget gate's weight matrices, $W_{xinp}$, $W_{hinp}$ are the input gate's weight matrices and $W_{xout}$, $W_{hout}$ are the output gate's weight matrices; $b_{for}$, $b_{inp}$ and $b_{out}$ are the bias values of the forget gate, input gate and output gate respectively. $\sigma$ represents the sigmoid activation function. The forget gate ($f_t$) decides which part of the information is erased and which part is retained, and outputs a number between 0 and 1 through the sigmoid function. The input gate ($i_t$) and the forget gate ($f_t$) together specify the part of the information to be added to the cell state. Finally, the output gate ($o_t$) decides the information output from the cell state. The cell state $c_t$ and the current output of the hidden layer $h_t$ are calculated by (6) and (7),


$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_{xcell}\, x_t + W_{hcell}\, h_{t-1} + b_{cell}) \qquad (6)$$


$$h_t = o_t \odot \tanh(c_t) \qquad (7)$$


where $\odot$ represents the Hadamard product, that is, element-wise multiplication.
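
To make the gate equations concrete, the following is a minimal pure-Python sketch of one LSTM time step following (3) to (7). The helper names, the `params` dictionary layout and the all-zero weights are assumptions for illustration only; the experiments in this paper use Keras, not this code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(w_x, w_h, b, x, h):
    """Return W_x x + W_h h + b for list-of-lists matrices and list vectors."""
    return [
        sum(wx * xi for wx, xi in zip(row_x, x))
        + sum(wh * hi for wh, hi in zip(row_h, h))
        + bias
        for row_x, row_h, bias in zip(w_x, w_h, b)
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; params maps a gate name to (W_x, W_h, b)."""
    f = [sigmoid(z) for z in affine(*params["for"], x, h_prev)]      # (3)
    i = [sigmoid(z) for z in affine(*params["inp"], x, h_prev)]      # (4)
    o = [sigmoid(z) for z in affine(*params["out"], x, h_prev)]      # (5)
    g = [math.tanh(z) for z in affine(*params["cell"], x, h_prev)]   # candidate
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]  # (6)
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]                    # (7)
    return h, c


# Two input features, one hidden unit, all weights and biases set to zero.
zero_gate = ([[0.0, 0.0]], [[0.0]], [0.0])
params = {name: zero_gate for name in ("for", "inp", "out", "cell")}
h, c = lstm_step([0.3, 0.7], h_prev=[0.0], c_prev=[1.0], params=params)
```

With zero weights every gate outputs 0.5 and the candidate is 0, so the new cell state is half the previous one, which makes the effect of the forget gate in (6) easy to see.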

The encoder-decoder sequence-to-sequence architecture (Enc-Dec-LSTM) uses an LSTM network as the encoder component, Luong's attention layer, another LSTM network as the decoder component and a dense layer, as shown in Figure 2. Although the encoder-decoder sequence-to-sequence architecture was developed for natural language translation, it has been successfully applied to time-series forecasting tasks such as air-quality and traffic prediction. The encoder encodes the information from the input into a fixed-length vector. The final outputs of the encoder are discarded, and the internal (cell) state and hidden state, together called the fixed-length vector, are fed into the decoder. The decoder is also given the previous hour of the target and trained to predict the next hour. This training process is called teacher forcing.
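
One common way to prepare decoder inputs for teacher forcing is to shift the target sequence by one step, as in the sketch below. The function name and the choice of `start_value` for the first decoder step are assumptions; the paper does not state what the decoder receives at the first hour.

```python
def teacher_forcing_inputs(target, start_value=0.0):
    """Shift the target sequence right by one step to form decoder inputs."""
    # The decoder input at step t is the true target at step t - 1; the first
    # step has no predecessor, so a placeholder start value is used (assumed).
    return [start_value] + list(target[:-1])


# Normalized hourly GHI targets for one day (illustrative values only).
target_ghi = [0.10, 0.35, 0.60, 0.72]
decoder_in = teacher_forcing_inputs(target_ghi)
```

During training the decoder sees `decoder_in[t]` and is asked to output `target_ghi[t]`; at forecast time the true previous value is unavailable, so the model's own previous prediction would typically be fed back instead.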


Figure 2. Encoder-decoder sequence-to-sequence architecture


Figure 3. Attention layer


2.3. Attention mechanism 

According to Luong et al., a potential issue of the encoder is that compressing all the necessary information of the input into a fixed-length vector may cause the decoder to fail at generating long sequences. The attention layer shown in Figure 3 allows the model to access all the past hidden states of the encoder instead of the last hidden state alone. The alignment score $e_{t,i}$ for Luong's attention is calculated as in (8),


$$e_{t,i} = h_{d,t}^{\top} h_i \qquad (8)$$

$$\alpha_{t,i} = \frac{\exp(e_{t,i})}{\sum_{j=1}^{N} \exp(e_{t,j})} \qquad (9)$$

$$c_t = \sum_{i=1}^{N} \alpha_{t,i}\, h_i \qquad (10)$$


where $h_{d,t}$ is the current target state, that is, the $t^{th}$ hidden state of the decoder, $h_i$ is the $i^{th}$ hidden state of the encoder and $N$ is the number of encoder hidden states. The attention weight $\alpha_{t,i}$ in (9) is calculated by applying softmax to the alignment scores so that the weights sum to 1. The context vector $c_t$ in (10), which is distinct from the LSTM cell state in (6), is computed by multiplying each encoder hidden state by its attention weight and summing over all encoder states. The context vector is then concatenated with the current target state $h_{d,t}$ and fed into a fully connected feed-forward network (FFN). Computation time [10] for training these networks is not critical, as training is done offline, while forecasting with the trained network is fast.
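
The three attention steps (8) to (10) can be sketched for a single decoder step as follows. This is a minimal pure-Python illustration with made-up hidden states; the function name is hypothetical, and subtracting the maximum score before exponentiation is a standard numerical-stability step added here, not something described in the paper.

```python
import math


def luong_dot_attention(decoder_state, encoder_states):
    """Return (attention weights, context vector) for one decoder step."""
    # (8): alignment score is the dot product of decoder and encoder states.
    scores = [sum(d * e for d, e in zip(decoder_state, h_i))
              for h_i in encoder_states]
    # (9): softmax over encoder positions (max subtracted for stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [v / total for v in exps]
    # (10): context vector is the attention-weighted sum of encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * h_i[k] for w, h_i in zip(weights, encoder_states))
               for k in range(dim)]
    return weights, context


# Three encoder hidden states of size 2 and one decoder state (illustrative).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = luong_dot_attention([0.5, 0.5], encoder_states)
```

The third encoder state has the largest dot product with the decoder state, so it receives the largest attention weight and dominates the context vector.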


3. DESCRIPTION OF DATA AND PREPROCESSING 
3.1. Dataset 

Solar irradiance data can be obtained either from a measuring instrument installed on site or from a satellite-derived irradiance dataset. Although satellite-derived data are less accurate than data collected from a measuring instrument, they are often used by researchers [24] because of their open access, ease of use, wide temporal and spatial coverage and almost complete absence of missing data. The dataset of real-world meteorological values is collected from the National Solar Radiation Database (NSRDB) of the National Renewable Energy Laboratory (NREL) for New Delhi, India. Hourly data of global horizontal irradiance (GHI), temperature, pressure, relative humidity, wind direction and wind speed are obtained for the years 2009 to 2015. Solar irradiance exists only during daytime, and hence the hours between 7:00 AM and 5:00 PM are considered. Analysis of the dataset shows that solar irradiance at the selected location peaks in April and May, which reflects its seasonal behaviour.


Figure 4. Heat map with correlation coefficient between input variables (GHI, temperature, pressure, relative humidity, wind direction and wind speed)


3.2. Data normalization 

The data loaded into the neural network are normalized to the range [0, 1] according to (11), where $d_i$ is the data before normalization, $d_i'$ is the data after normalization, and $d_{min}$ and $d_{max}$ are the minimum and maximum values of the variable. The aim of data normalization is to bring the numeric values in the dataset to a common scale.


$$d_i' = \frac{d_i - d_{min}}{d_{max} - d_{min}} \qquad (11)$$

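
A minimal sketch of the min-max scaling in (11), written in plain Python for one variable. The function name, the handling of a constant column and the temperature values are assumptions for illustration; in practice the minimum and maximum would typically be taken from the training set only and then reused for the test set.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] following equation (11)."""
    d_min, d_max = min(values), max(values)
    span = d_max - d_min
    if span == 0:
        # A constant column has no range; mapping it to 0 is an assumption.
        return [0.0 for _ in values]
    return [(d - d_min) / span for d in values]


# Hourly temperatures in degrees Celsius (illustrative values only).
temperature = [12.0, 18.0, 30.0, 24.0]
scaled = min_max_normalize(temperature)
```

The smallest value maps to 0, the largest to 1, and the others fall proportionally in between.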
3.3. Correlation 


The linear relationship between two variables is commonly measured with Pearson's correlation coefficient. The correlation between solar irradiance and the other weather variables is measured using Pearson's correlation coefficient, as shown in Figure 4. From this analysis, temperature is found to be positively correlated with GHI and relative humidity is found to be negatively correlated with GHI. In the literature, Evans and Denes et al. classified the absolute value of the correlation coefficient as very weak if it is between 0 and 0.19, weak if between 0.20 and 0.39, moderate if between 0.40 and 0.59, strong if between 0.60 and 0.79 and very strong if between 0.80 and 0.99. As per this classification, wind direction and wind speed can be neglected, as their correlation with GHI is very weak. The sliding window technique is used in preprocessing the data.
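
The feature screening described above can be sketched as follows: compute Pearson's coefficient between GHI and a candidate variable, then map its absolute value onto the strength bands quoted in the text. The function names and sample series are illustrative assumptions, and values falling between the quoted bands (for example 0.195) are assigned to the lower band here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_strength(r):
    """Label |r| using the bands quoted in the text."""
    a = abs(r)
    if a < 0.20:
        return "very weak"
    if a < 0.40:
        return "weak"
    if a < 0.60:
        return "moderate"
    if a < 0.80:
        return "strong"
    return "very strong"


# Made-up hourly samples, not values from the NSRDB dataset.
ghi = [100.0, 300.0, 500.0, 700.0]
temperature = [20.0, 24.0, 27.0, 33.0]
humidity = [80.0, 65.0, 50.0, 30.0]
r_temp = pearson_r(ghi, temperature)
r_hum = pearson_r(ghi, humidity)
```

A variable whose label comes out as "very weak" would be a candidate for removal, which is the reasoning applied to wind speed and wind direction.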


4. EXPERIMENTS AND RESULTS 
4.1. Training and testing data 

The data from January 2009 to December 2013 are taken as the training set and the data from 2014 as the test set. The training data are further split into training and validation sets using a train-test splitter that devotes 80% of the data to training and the remaining 20% to validation. The input weather variables span different ranges of values. Hence, the datasets are rescaled to lie in the range [0, 1], which is called normalization. The datasets are normalized using MinMaxScaler in scikit-learn according to (11).


4.2. Metrics 


Standard statistical measures such as root mean square error (RMSE) and mean absolute error (MAE) are commonly used to measure the accuracy of a forecast model [29],


$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_{pred,i} - Y_{actual,i}\right)^{2}} \qquad (12)$$


$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|Y_{pred,i} - Y_{actual,i}\right| \qquad (13)$$


where $Y_{pred}$ is the predicted irradiance value, $Y_{actual}$ is the actual irradiance value and $n$ is the number of samples. For a fair comparison, forecast skill (FS) [18] is one of the most recommended metrics in forecasting; in (14), SP denotes smart persistence.


$$Forecast\ Skill = 1 - \frac{RMSE_{proposed}}{RMSE_{SP}} \qquad (14)$$
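
The three metrics in (12) to (14) are simple to compute; the sketch below uses plain Python with made-up predictions. The function names and sample values are illustrative and not taken from the experiments.

```python
import math


def rmse(pred, actual):
    """Root mean square error, equation (12)."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))


def mae(pred, actual):
    """Mean absolute error, equation (13)."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)


def forecast_skill(rmse_proposed, rmse_sp):
    """Forecast skill relative to smart persistence, equation (14)."""
    return 1.0 - rmse_proposed / rmse_sp


# Illustrative irradiance values in W/m^2.
predicted = [100.0, 200.0, 300.0]
actual = [110.0, 190.0, 320.0]
error_rmse = rmse(predicted, actual)
error_mae = mae(predicted, actual)
skill = forecast_skill(rmse_proposed=80.0, rmse_sp=100.0)
```

A skill of 0 means the model is no better than smart persistence, and a skill approaching 1 means its RMSE is far below that of the benchmark.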


4.3. Experiments 


The experiments described here use Keras version 2.3.1 to implement the BPNN, RNN, LSTM and LSTM based encoder-decoder sequence-to-sequence model with attention. The hyper-parameters of these models are tuned with the grid-search method. BPNN has 55 units and 95 units in hidden layer 1 and hidden layer 2 respectively, RNN has 95 units and 105 units, and LSTM has 85 units and 125 units in their respective hidden layer 1 and hidden layer 2. The encoder and decoder layers each have 95 units in the Enc-Dec-LSTM network. A dropout of 0.2 is used in each of the input layers as a regularisation technique. The Adam optimiser is used as it combines the best features of RMSprop and AdaGrad, and the batch size is set to 100 from the grid search. The smart persistence model requires no training or parameter tuning.


4.3.1. Forecast results 


The forecast is performed with temperature, relative humidity and pressure as input variables, and the Enc-Dec-LSTM model is compared with LSTM, RNN, BPNN and SP. Clear-sky GHI is used in smart persistence to forecast day-ahead irradiance. The hourly input variables from 8:00 am to 5:00 pm are considered, and therefore 10 time steps are accounted for each day. Different lag times from 10 hours to 22 hours were tested, and the model was found to give the least error for a 10-hour lag. Thus, for the day-ahead forecast, the previous day's 10 hours of data are given as input to predict the next day's 10 hours of solar irradiance with a resolution of one hour.
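
The pairing of one day's 10 hourly rows with the next day's 10 hourly GHI values can be sketched as a simple day-level sliding window. The function name, the row layout and the synthetic data are assumptions for illustration; in particular, whether the previous day's GHI is included in the input row alongside the weather variables is assumed here and is not stated explicitly in the text.

```python
def make_day_ahead_windows(days, ghi_index=0):
    """Pair each day's hourly feature rows with the next day's hourly GHI.

    `days` is a list of days; each day is a list of hourly rows, and each row
    is a list of features with GHI at position `ghi_index`.
    """
    inputs, targets = [], []
    for today, tomorrow in zip(days[:-1], days[1:]):
        inputs.append(today)                                  # 10 x n_features
        targets.append([row[ghi_index] for row in tomorrow])  # 10 GHI values
    return inputs, targets


# Three synthetic days, 10 hours each; row = [GHI, temperature, humidity, pressure].
days = [[[float(d * 100 + h), 25.0, 40.0, 1000.0] for h in range(10)]
        for d in range(3)]
inputs, targets = make_day_ahead_windows(days)
```

Three days yield two training samples, each mapping a 10-step multivariate input sequence to a 10-step GHI output sequence, which is the shape an encoder-decoder model expects.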


Table 1. Performance comparison of different algorithms

Algorithm       RMSE (W/m²)   MAE (W/m²)   FS (%)
Enc-Dec-LSTM    100.57        60.27        31.07
LSTM            104.52        61.88        28.37
RNN             109.95        64.37        24.64
BPNN            124.67        104.13       14.56
SP              145.91        79.77        0


Forecast results in terms of the error metrics are shown in Table 1. Enc-Dec-LSTM outperforms the other models: compared to SP, BPNN and RNN, RMSE is reduced by 31.1%, 19.3% and 8.5% respectively, and MAE is reduced by 24.4%, 42.1% and 6.4% respectively. A low forecast skill indicates that a model's performance is almost the same as that of the smart persistence model. The Enc-Dec-LSTM model has the highest forecast skill of 31.07%, which indicates that it performs better than any other model compared here. As shown in Figure 5, the Enc-Dec-LSTM model's forecast is nearer to the actual data even on a cloudy day, and hence its overall error is lower than that of the other models.
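
The percentage reductions quoted above follow directly from the Table 1 values. The short check below recomputes them; the helper name `percent_reduction` is introduced here for illustration only.

```python
def percent_reduction(baseline, proposed):
    """Relative error reduction of `proposed` with respect to `baseline`, in %."""
    return 100.0 * (baseline - proposed) / baseline


# RMSE and MAE values (W/m^2) taken from Table 1.
rmse_table = {"Enc-Dec-LSTM": 100.57, "RNN": 109.95, "BPNN": 124.67, "SP": 145.91}
mae_table = {"Enc-Dec-LSTM": 60.27, "RNN": 64.37, "BPNN": 104.13, "SP": 79.77}

rmse_reduction = {
    name: round(percent_reduction(value, rmse_table["Enc-Dec-LSTM"]), 1)
    for name, value in rmse_table.items() if name != "Enc-Dec-LSTM"
}
mae_reduction = {
    name: round(percent_reduction(value, mae_table["Enc-Dec-LSTM"]), 1)
    for name, value in mae_table.items() if name != "Enc-Dec-LSTM"
}
```

The RMSE reduction relative to SP is the same quantity as the forecast skill in (14) expressed as a percentage, which is why both come out at about 31.1%.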


The average monthly RMSE of the test dataset is shown in Table 2, and it is seen that the error peaks during the monsoon season. As shown in Figure 4, GHI is highly correlated with temperature, and thus the monthly correlation of temperature with GHI is examined on the test dataset. The correlation of temperature with GHI is low during July, August and September, which results in the highest error during the monsoon season.


Table 2. Average monthly RMSE (W/m²) and MAE (W/m²) of test dataset

Error  Algorithm      Jan      Feb      Mar      Apr      May      Jun      Jul      Aug      Sept     Oct
RMSE   Enc-Dec-LSTM   91.41    127.82   121.57   81.42    96.69    105.02   131.45   143.63   102.66   48.68
RMSE   LSTM           98.95    137.23   124.82   80.61    99.17    104.74   136.11   145.98   106.86   54.82
RMSE   RNN            123.18   155.47   130.17   82.13    98.71    102.13   138.43   153.29   108.72   51.37
RMSE   BPNN           98.76    133.35   136.35   114.75   121.88   114.53   137.24   153.98   123.97   73.09
MAE    Enc-Dec-LSTM   59.52    86.20    79.86    50.87    56.02    58.64    88.21    95.95    63.06    30.32
MAE    LSTM           60.85    85.63    76.18    47.38    59.34    64.02    93.31    96.05    67.22    36.06
MAE    RNN            69.90    92.07    80.70    52.60    61.95    62.73    93.05    101.06   70.23    32.22
MAE    BPNN           71.29    103.27   106.79   99.72    105.91   93.45    108.85   122.92   97.12    59.49

4.3.2. Forecast results at different location

In addition, geographical location and climatic conditions also affect forecast accuracy, and hence a test is made on three different locations with different climatic conditions to study and compare the feasibility of the Enc-Dec-LSTM model. The LSTM and Enc-Dec-LSTM models are compared on datasets collected from NSRDB at locations with different climatic conditions according to the Köppen-Geiger climate classification. The data from 2009 to 2013 are used as the training dataset and the data from 2014 as the testing dataset. Table 3 lists the day-ahead RMSE of LSTM and Enc-Dec-LSTM, in which Enc-Dec-LSTM has the least error at all the locations. In the Köppen-Geiger climate classification, Csa, BSh and Aw, as listed in Table 3, denote hot-summer Mediterranean climate, hot semi-arid (steppe) climate and tropical savanna climate respectively.


Table 3. Day-ahead RMSE of forecast model at different locations

Latitude   Longitude   Climate   LSTM RMSE (W/m²)   Enc-Dec-LSTM RMSE (W/m²)
23.25      71.35       Csa       101.42             98.47
26.25      73.05       BSh       88.44              85.5
22.65      88.45       Aw        121.46             117.55


4.3.3. Comparison with recently published papers

A comparison of recently published works on one day-ahead solar irradiance forecast is made in Table 4. Emerging deep learning techniques show great improvement in accuracy for day-ahead solar irradiance forecast. Forecast error can also depend on geographical location and climatic condition, and therefore forecast skill as developed by Yang can be the best reference for comparison with other models. As per the forecast skill comparison in Table 4, the LSTM based encoder-decoder sequence-to-sequence model with attention mechanism has the highest skill of 31.07% and thus outperforms the other models.


Table 4. Comparison of day-ahead solar irradiance with recently published works

Author                Algorithm         Location                 RMSE            FS (%)
Larson et al.         LSO^a and NWP^b   San Diego, USA           27.5 %          24
Aryaputera et al.     WRF^c and ETS^d   Station 500, Singapore   188 (W/m²)      12.9 (as per (14))
Hai et al.            DFT^e             Qingdao, China           127.3 (W/m²)    6.3
Qing and Niu et al.   LSTM              Cape Verde, Santiago     122.72 (W/m²)   -
Gao et al.            GRU               Denver, USA              122.45 (W/m²)   28.4
Present work          Enc-Dec-LSTM      New Delhi, India         100.57 (W/m²)   31.07

Abbreviations: ^a least-squares optimization, ^b numerical weather prediction, ^c weather research forecasting, ^d exponential smoothing, ^e discrete Fourier transform


5. CONCLUSION AND FUTURE WORK 

This paper studies encoder-decoder sequence-to-sequence models with attention, which were originally developed for natural language processing, for solar irradiance forecast. The data are first collected from the NSRDB site, processed with the sliding window technique and then normalised before being applied to the deep-learning models to improve accuracy. Unwanted features are removed using Pearson's correlation method. Five years of data are supplied for training and one year of data for testing. Based on the experimental results, the LSTM based encoder-decoder sequence-to-sequence model with attention mechanism outperforms the other techniques, as it combines the encoder-decoder structure with the attention mechanism, which reduces error and improves accuracy, though the computation time of the Enc-Dec-LSTM model is higher than that of LSTM. Further, the recently developed CNN based hybrid models and transformer models could also be studied for solar irradiance forecast.